A Missing statements and proofs
A.1 Statements for Section 3.1
Consider a two-player Markov game in which both players affect the transition, and, as discussed in Section 2.1, the case of unilateral deviation from a joint policy. Let ˆσ be a (possibly correlated) joint policy. By Lemma A.1, the equality holds due to the zero-sum property (1). An approximate NE is an approximate global minimum, and conversely an approximate global minimum is an approximate NE.
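The equivalence asserted in the last two sentences can be made precise via the standard notion of exploitability. Since the surrounding definitions are missing from this excerpt, the following is a hedged sketch in assumed notation, where V_i^σ denotes player i's value under joint policy σ:

```latex
% Zero-sum property (1): V_1^{\sigma} + V_2^{\sigma} = 0 for every joint policy \sigma.
% Exploitability of a joint policy \hat\sigma = (\hat\sigma_1, \hat\sigma_2):
\mathrm{expl}(\hat\sigma)
  \;=\; \max_{\sigma_1'} V_1^{\sigma_1', \hat\sigma_2}
        \;+\; \max_{\sigma_2'} V_2^{\hat\sigma_1, \sigma_2'}
  \;\ge\; 0 .
% \hat\sigma is an \epsilon-NE iff each term above is at most \epsilon, hence
% \mathrm{expl}(\hat\sigma) \le 2\epsilon; and \mathrm{expl} attains its global
% minimum 0 exactly at Nash equilibria. An approximate NE is therefore an
% approximate global minimizer of \mathrm{expl}, and conversely.
```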
Ex ante coordination and collusion in zero-sum multi-player extensive-form games
Gabriele Farina, Andrea Celli, Nicola Gatti, Tuomas Sandholm
Recent milestones in equilibrium computation, such as the success of Libratus, show that it is possible to compute strong solutions to two-player zero-sum games in theory and practice. This is not the case for games with more than two players, which remain one of the main open challenges in computational game theory. This paper focuses on zero-sum games where a team of players faces an opponent, as is the case, for example, in Bridge, collusion in poker, and many non-recreational applications such as war, where the colluders do not have time or means of communicating during battle, collusion in bidding, where communication during the auction is illegal, and coordinated swindling in public. The possibility for the team members to communicate before game play--that is, to coordinate their strategies ex ante--makes the use of behavioral strategies unsatisfactory. The reasons for this are closely related to the fact that the team can be represented as a single player with imperfect recall. We propose a new game representation, the realization form, that generalizes the sequence form but can also be applied to imperfect-recall games. Then, we use it to derive an auxiliary game that is equivalent to the original one. It provides a sound way to map the problem of finding an optimal ex ante coordinated strategy for the team to the well-understood Nash equilibrium-finding problem in a (larger) two-player zero-sum perfect-recall game. By reasoning over the auxiliary game, we devise an anytime algorithm, fictitious team-play, that is guaranteed to converge to an optimal coordinated strategy for the team against an optimal opponent, and that is dramatically faster than the prior state-of-the-art algorithm for this problem.
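The paper's fictitious team-play builds on the classical fictitious-play scheme, in which each player repeatedly best-responds to the opponent's empirical mixture of past actions. As a minimal illustrative sketch of that classical building block only (not the paper's algorithm, which operates on the auxiliary extensive-form game), here is simultaneous fictitious play on a two-player zero-sum matrix game; the matching-pennies payoff matrix and all names are illustrative assumptions:

```python
import numpy as np

def fictitious_play(A, iters=2000):
    """Simultaneous fictitious play on a zero-sum matrix game.

    A[i, j] is the payoff to the row player when row plays i and
    column plays j (the column player receives -A[i, j]).
    Returns the empirical mixed strategies of both players, which
    converge to a Nash equilibrium in zero-sum games (Robinson, 1951).
    """
    m, n = A.shape
    row_counts = np.zeros(m)
    col_counts = np.zeros(n)
    # Arbitrary initial pure actions.
    row_counts[0] += 1
    col_counts[0] += 1
    for _ in range(iters):
        # Each player best-responds to the opponent's empirical mixture.
        row_mix = row_counts / row_counts.sum()
        col_mix = col_counts / col_counts.sum()
        row_counts[np.argmax(A @ col_mix)] += 1   # row maximizes
        col_counts[np.argmin(row_mix @ A)] += 1   # column minimizes
    return row_counts / row_counts.sum(), col_counts / col_counts.sum()

# Matching pennies: the unique equilibrium is (1/2, 1/2) for both players,
# so both empirical mixtures approach the uniform distribution.
A = np.array([[1.0, -1.0], [-1.0, 1.0]])
x, y = fictitious_play(A)
```

Fictitious team-play retains the anytime flavor of this loop (the empirical average improves over iterations) while the auxiliary-game construction restores perfect recall, which is what makes the best-response and convergence machinery applicable to the team setting.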